12 research outputs found

    SOM-VAE: Interpretable Discrete Representation Learning on Time Series

    Full text link
    High-dimensional time series are common in many domains. Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations. However, most representation learning algorithms for time series data are difficult to interpret, owing to non-intuitive mappings from data features to salient properties of the representation and to non-smoothness over time. To address this problem, we propose a new representation learning framework building on ideas from interpretable discrete dimensionality reduction and deep generative modeling. This framework allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. We introduce a new way to overcome the non-differentiability in discrete representation learning and present a gradient-based version of the traditional self-organizing map algorithm that is more performant than the original. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the representation space. This model uncovers the temporal transition structure, improves clustering performance even further, and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model in terms of clustering performance and interpretability on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real-world medical time series application on the eICU data set. Our learned representations compare favorably with competitor methods and facilitate downstream tasks on the real-world data.
    Comment: Accepted for publication at the Seventh International Conference on Learning Representations (ICLR 2019)
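    The gradient-based self-organizing map idea can be illustrated with a minimal sketch: instead of the classical winner-take-all update, take a gradient step on a neighbourhood-weighted quantisation loss. The function name and weighting below are illustrative assumptions, not the paper's actual implementation.

    ```python
    import numpy as np

    def som_gradient_step(nodes, grid, x, lr=0.1, sigma=1.0):
        """One gradient step of a self-organizing map toward input x.

        nodes: (K, D) node embeddings; grid: (K, 2) fixed 2-D grid coordinates.
        """
        # Best-matching unit (BMU): the node closest to x in embedding space
        bmu = np.argmin(((nodes - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood weights on the grid, centred at the BMU
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2 * sigma ** 2))
        # Gradient of the weighted quantisation loss 0.5 * sum_k h_k ||x - w_k||^2
        grad = h[:, None] * (nodes - x)
        return nodes - lr * grad
    ```

    Because the update is a plain gradient of a differentiable loss, it can be composed with other losses (e.g. a VAE reconstruction term) and trained end to end, which is the property the abstract exploits.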

    Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families

    Get PDF
    We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities where classical HMC is not an option due to intractable gradients, KMC adaptively learns the target's gradient structure by fitting an exponential family model in a Reproducing Kernel Hilbert Space. Computational costs are reduced by two novel efficient approximations to this gradient. While being asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and offers substantial mixing improvements over state-of-the-art gradient-free samplers. We support our claims with experimental studies on both toy and real-world applications, including Approximate Bayesian Computation and exact-approximate MCMC.
    Comment: 20 pages, 7 figures
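    The key observation behind this line of work is that HMC's leapfrog integrator only needs a score function (the gradient of the log target), so a learned surrogate score can be plugged in wherever the true gradient is intractable. The sketch below is a generic textbook leapfrog integrator, not KMC's actual code; the surrogate score in the usage example is a hypothetical stand-in.

    ```python
    import numpy as np

    def leapfrog(q, p, score_fn, step=0.1, n_steps=10):
        """Simulate one HMC trajectory, using score_fn in place of grad log p.

        q, p: position and momentum arrays; score_fn(q) -> array like q.
        """
        q, p = q.copy(), p.copy()
        p += 0.5 * step * score_fn(q)      # initial half-step for momentum
        for _ in range(n_steps - 1):
            q += step * p                  # full position step
            p += step * score_fn(q)        # full momentum step
        q += step * p                      # final position step
        p += 0.5 * step * score_fn(q)      # final half-step for momentum
        return q, p
    ```

    With an accurate surrogate score the trajectory nearly conserves the Hamiltonian, so Metropolis acceptance rates stay high; with an exact accept/reject step on the true (unnormalised) density, the resulting chain remains asymptotically exact even when the surrogate is imperfect.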

    Kernel methods for Monte Carlo

    Get PDF
    This thesis investigates the use of reproducing kernel Hilbert spaces (RKHS) in the context of Monte Carlo algorithms. The work proceeds in three main themes.
    Adaptive Monte Carlo proposals: We introduce and study two adaptive Markov chain Monte Carlo (MCMC) algorithms to sample from target distributions with non-linear support and intractable gradients. Our algorithms, generalisations of random walk Metropolis and Hamiltonian Monte Carlo, adaptively learn local covariance and gradient structure respectively, by modelling past samples in an RKHS. We further show how to embed these methods into the sequential Monte Carlo framework.
    Efficient and principled score estimation: We propose methods for fitting an RKHS exponential family model that work by fitting the gradient of the log density, the score, thus avoiding the need to compute a normalization constant. While the problem is of general interest, here we focus on its embedding into the adaptive MCMC context from above. We improve the computational efficiency of an earlier solution with two novel fast approximation schemes without guarantees, and a low-rank, Nyström-like solution. The latter retains the consistency and convergence rates of the exact solution, at lower computational cost.
    Goodness-of-fit testing: We propose a non-parametric statistical test for goodness-of-fit. The measure is a divergence constructed via Stein's method using functions from an RKHS. We derive a statistical test, both for i.i.d. and non-i.i.d. samples, and apply the test to quantifying convergence of approximate MCMC methods, statistical model criticism, and evaluating accuracy in non-parametric score estimation.
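    As an illustration of the goodness-of-fit theme: a kernelized Stein discrepancy can be estimated from samples using only the target's score function, with no normalization constant. The one-dimensional V-statistic sketch below, with an RBF kernel and fixed bandwidth, is a generic illustration under those assumptions, not the thesis code.

    ```python
    import numpy as np

    def ksd_v_statistic(samples, score_fn, h=1.0):
        """V-statistic estimate of the kernelized Stein discrepancy (1-D, RBF kernel).

        samples: (n,) array; score_fn(x) -> d/dx log p(x) of the target.
        """
        x = samples[:, None]
        y = samples[None, :]
        s = score_fn(samples)
        sx, sy = s[:, None], s[None, :]
        d = x - y
        k = np.exp(-d**2 / (2 * h**2))       # RBF kernel matrix
        dkx = -d / h**2 * k                  # dk/dx
        dky = d / h**2 * k                   # dk/dy
        dkxy = (1 / h**2 - d**2 / h**4) * k  # d^2 k / dx dy
        # Stein kernel: expectation is zero iff samples follow the target
        u = sx * sy * k + sx * dky + sy * dkx + dkxy
        return u.mean()
    ```

    The estimate is near zero when the samples match the target and grows with model mismatch, which is what makes it usable both as a test statistic and as a diagnostic for approximate MCMC.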

    Sparse Gaussian Processes on Discrete Domains

    No full text
    Kernel methods on discrete domains have shown great promise for many challenging data types, for instance, biological sequence data and molecular structure data. Scalable kernel methods like Support Vector Machines may offer good predictive performance but do not intrinsically provide uncertainty estimates. In contrast, probabilistic kernel methods like Gaussian Processes offer uncertainty estimates in addition to good predictive performance but fall short in terms of scalability. While the scalability of Gaussian processes can be improved using sparse inducing point approximations, the selection of these inducing points remains challenging. We explore different techniques for selecting inducing points on discrete domains, including greedy selection, determinantal point processes, and simulated annealing. We find that simulated annealing, which can select inducing points that are not in the training set, can perform competitively with support vector machines and full Gaussian processes on synthetic data, as well as on challenging real-world DNA sequence data.
    ISSN: 2169-353
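    A minimal sketch of inducing-point selection by simulated annealing, assuming a log-determinant diversity objective over the induced kernel submatrix as a stand-in for the variational bound actually optimised with sparse GPs; the function name, schedule, and objective are all illustrative.

    ```python
    import numpy as np

    def anneal_inducing_points(K, m, n_iters=300, t0=1.0, seed=0):
        """Pick m inducing indices from an n x n kernel matrix K by annealing."""
        rng = np.random.default_rng(seed)
        n = K.shape[0]

        def objective(ix):
            # Log-determinant of the induced submatrix: rewards diverse points
            sub = K[np.ix_(ix, ix)]
            return np.linalg.slogdet(sub + 1e-6 * np.eye(m))[1]

        idx = rng.choice(n, size=m, replace=False)
        cur = best = objective(idx)
        best_idx = idx.copy()
        for it in range(n_iters):
            temp = max(t0 * (1 - it / n_iters), 1e-3)  # linear cooling schedule
            cand = idx.copy()
            pool = np.setdiff1d(np.arange(n), cand)
            cand[rng.integers(m)] = rng.choice(pool)   # swap one index
            new = objective(cand)
            # Metropolis rule: always accept improvements, sometimes worsenings
            if new > cur or rng.random() < np.exp((new - cur) / temp):
                idx, cur = cand, new
                if new > best:
                    best, best_idx = new, cand.copy()
        return best_idx, best
    ```

    Because the proposal step only swaps indices, the same loop works unchanged on discrete domains such as DNA sequences, where gradient-based inducing-point optimisation is unavailable; on such domains one would also let the candidate pool range over points outside the training set, as the abstract describes.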

    Kernel Adaptive Metropolis-Hastings

    No full text
    A Kernel Adaptive Metropolis-Hastings algorithm is introduced, for the purpose of sampling from a target distribution with strongly nonlinear support. The algorithm embeds the trajectory of the Markov chain into a reproducing kernel Hilbert space (RKHS), such that the feature space covariance of the samples informs the choice of proposal. The procedure is computationally efficient and straightforward to implement, since the RKHS moves can be integrated out analytically: our proposal distribution in the original space is a normal distribution whose mean and covariance depend on where the current sample lies in the support of the target distribution, and adapts to its local covariance structure. Furthermore, the procedure requires neither gradients nor any other higher order information about the target, making it particularly attractive for contexts such as Pseudo-Marginal MCMC. Kernel Adaptive Metropolis-Hastings outperforms competing fixed and adaptive samplers on multivariate, highly nonlinear target distributions, arising in both real-world and synthetic examples.
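    A sketch of how such a kernel-informed proposal covariance might be assembled: with a Gaussian kernel, the gradients of k(x, z_i) with respect to the current state x, stacked over a window of past samples z_i and centred, yield a covariance that adapts to the local geometry of the chain's history. The parameter names (`gamma2`, `nu2`, bandwidth `h`) are assumptions for illustration, not the paper's notation.

    ```python
    import numpy as np

    def kernel_adaptive_proposal_cov(x, history, gamma2=0.1, nu2=1.0, h=1.0):
        """Locally adaptive Gaussian proposal covariance at the current state x.

        Returns gamma2 * I + nu2 * M H M^T, where the columns of M are the
        kernel gradients grad_x k(x, z_i) over past samples z_i, and H is the
        centering matrix. The isotropic gamma2 term keeps the proposal proper.
        """
        Z = np.asarray(history)               # (n, d) window of past samples
        n, d = Z.shape
        diff = Z - x                          # (n, d)
        k = np.exp(-np.sum(diff**2, axis=1) / (2 * h**2))  # RBF kernel values
        M = (diff / h**2 * k[:, None]).T      # (d, n): grad_x k(x, z_i) columns
        H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
        return gamma2 * np.eye(d) + nu2 * M @ H @ M.T
    ```

    Since M H M^T is positive semi-definite and the gamma2 term is positive definite, the result is always a valid covariance, so it can be dropped directly into a standard Metropolis-Hastings accept/reject step with no gradient information about the target itself.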

    The Biomarker S100B and Physical Activity: Implications for Sports-Related Concussion Management

    No full text
    OBJECTIVE: Elevated levels of the astroglial protein S100B have been shown to predict sport-related concussion. However, S100B levels within an athlete can vary depending on the type of physical activity (PA) engaged in and the methodologic approach used to measure them. Thus, appropriate reference values in the diagnosis of concussed athletes remain undefined. The purpose of our systematic literature review was to provide an overview of the current literature examining S100B measurement in the context of PA. The overall goal is to improve the use of the biomarker S100B in the context of sport-related concussion management. DATA SOURCES: PubMed, SciVerse Scopus, SPORTDiscus, CINAHL, and Cochrane. STUDY SELECTION: We selected articles that contained (1) research studies focusing exclusively on humans in which (2) either PA was used as an intervention or the test participants or athletes were involved in PA and (3) S100B was measured as a dependent variable. DATA EXTRACTION: We identified 24 articles. Study variations included the mode of PA used as an intervention, sample types, sample-processing procedures, and analytic techniques. DATA SYNTHESIS: Given the nonuniformity of the analytical methods used and the data samples collected, as well as differences in the types of PA investigated, we were not able to determine a single consistent reference value of S100B in the context of PA. Thus, a clear distinction between a concussed athlete and a healthy athlete cannot yet be made solely on the basis of the existing S100B cutoff value of 0.1 μg/L. However, because of its high sensitivity and excellent negative predictive value, S100B measurement seems to have the potential to be a diagnostic adjunct for concussion in sports settings. We recommend that the interpretation of S100B values be based on congruent study designs to ensure measurement reliability and validity.